Named Entity Recognizer employing Multiclass Support Vector Machines for the Development of Question Answering Systems

Authors

  • Sumam Mary Idicula
  • Menno van Zaanen
  • Asif Ekbal
  • Rejwanul Haque
  • Amitava Das
  • Venkateswarlu Poka
  • Dan Roth
  • Sivaji Bandyopadhyay
  • Kashif Riaz
  • Vinaya Babu
  • Mohammad Hasanuzzaman
  • Guy De Pauw
Abstract

Named Entity Recognition (NER) seeks to locate and classify atomic elements in text into predefined categories such as person, organization, location, quantity, and percentage. Named entities indicate the role of each meaning-bearing word in a sentence, so identifying them helps extract the essence of a text, which is important in Question Answering (QA), Information Extraction (IE), and Summarization. The system presented here is a Named Entity (NE) classifier built with multiclass Support Vector Machines and based on linguistic grammar principles. Malayalam NER is difficult because the words of a named entity carry no distinguishing surface feature, such as the capitalization available in English. NER systems built for other languages are not suitable for Malayalam because its morphology, syntax, and lexical semantics differ from theirs. In addition, no tagged corpus is available for training. To test the system, documents were selected from well-known Malayalam newspapers and magazines, with passages from five fields: sports, health, politics, science, and agriculture. Experimental results show average precision, recall, and F-measure values of 89.12%, 89.15%, and 89.13%, respectively.

General Terms: Natural Language Processing, Support Vector Machines
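
As a quick consistency check on the reported figures, the sketch below computes the balanced F-measure from the averaged precision and recall. It assumes the standard F1 definition (the harmonic mean of precision and recall). The abstract does not say whether the reported F-measure was computed this way or averaged per domain, and the function name `f_measure` is illustrative only.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Average values reported in the abstract, in percent.
precision, recall = 89.12, 89.15

# Assumption: the paper's F-measure is the balanced F1 score.
f1 = f_measure(precision, recall)
print(round(f1, 2))  # 89.13
```

Under that assumption the result rounds to the 89.13% stated in the abstract.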

Similar resources

Efficient Support Vector Classifiers for Named Entity Recognition

Named Entity (NE) recognition is a task in which proper nouns and numerical information are extracted from documents and are classified into categories such as person, organization, and date. It is a key technology of Information Extraction and Open-Domain Question Answering. First, we show that an NE recognizer based on Support Vector Machines (SVMs) gives better scores than conventional syste...

Named Entity Recognition in Persian Text using Deep Learning

Named entities recognition is a fundamental task in the field of natural language processing. It is also known as a subset of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...

A Tree Kernel approach to Question and Answer Classification in Question Answering Systems

A critical step in Question Answering design is the definition of the models for question focus identification and answer extraction. In case of factoid questions, we can use a question classifier (trained according to a target taxonomy) and a named entity recognizer. Unfortunately, this latter cannot be applied to generate answers related to non-factoid questions. In this paper, we ta...

Two-Pass Named Entity Classification for Cross Language Question Answering

In this paper, we present the mono-lingual and bilingual question answering experimental results at NTCIR6-CLQA. We combine most of the online resources and available resources to our QA systems without employing additional resources such as ontology, labeled data. Our method relies on three main important components, namely, passage retrieval, question classifier, and the named entity recogniz...

NTT Question Answering System in TREC 2001

In this report, we describe our question-answering system SAIQA-e (System for Advanced Interactive Question Answering in English) which ran the main task of TREC-10’s QA-track. Our system has two characteristics (1) named entity recognition based on support vector machines and (2) heuristic apposition detection. The MPR score of the main task is 0.228 and experimental results indicate the effec...

Publication date: 2011